Using Blob Detection in Missing Feature Linear-Frequency Cepstral Coefficients for Robust Sound Event Recognition

نویسندگان

  • Yi Ren Leng
  • Tran Huy Dat
چکیده

The proposed Missing Feature Linear-Frequency Cepstral Coefficients (MF-LFCC) is a noise robust cepstral feature that transforms both clean and noisy signals into a similar representation. Unlike conventional Missing Feature Techniques, the MF-LFCC does not require the substitution of spectrogram elements (imputation) or classifier modification (marginalization). To improve the noise mask used in the MF-LFCC, we propose to use the computer vision technique of blob detection to identify the peaks characterizing the sparsity of sound event spectrograms. For single sound event recognition using SVM classifiers, the MF-LFCC is shown to significantly outperform the MFCC baseline and the noise robust ESTI Advanced Front End feature in noisy conditions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving the performance of MFCC for Persian robust speech recognition

The Mel Frequency cepstral coefficients are the most widely used feature in speech recognition but they are very sensitive to noise. In this paper to achieve a satisfactorily performance in Automatic Speech Recognition (ASR) applications we introduce a noise robust new set of MFCC vector estimated through following steps. First, spectral mean normalization is a pre-processing which applies to t...

متن کامل

Using Mel-Frequency Cepstral Coefficients in Missing Data Technique

Filter bank is the most common feature being employed in the research of the marginalisation approaches for robust speech recognition due to its simplicity in detecting the unreliable data in the frequency domain. In this paper, we propose a hybrid approach based on the marginalisation and the soft decision techniques that make use of the Mel-frequency cepstral coefficients (MFCCs) instead of f...

متن کامل

Selective gammatone filterbank feature for robust sound event recognition

This paper introduces a novel feature based on the raw output of the gammatone filterbank. Channel selection is used to enhance robustness over a range of signal-to-noise ratios (SNR) of additive noise. The recognition accuracy of the proposed feature is tested on a sound event database using a Hidden Markov Model (HMM) recogniser. A comparison with a series of similar features and the conventi...

متن کامل

DWT and LPC based feature extraction methods for isolated word recognition

In this article, new feature extraction methods, which utilize wavelet decomposition and reduced order linear predictive coding (LPC) coefficients, have been proposed for speech recognition. The coefficients have been derived from the speech frames decomposed using discrete wavelet transform. LPC coefficients derived from subband decomposition (abbreviated as WLPC) of speech frame provide bette...

متن کامل

Physiologically Motivated Feature Extraction for Robust Automatic Speech Recognition

In this paper, a new method is presented to extract robust speech features in the presence of the external noise. The proposed method based on two-dimensional Gabor filters takes in account the spectro-temporal modulation frequencies and also limits the redundancy on the feature level. The performance of the proposed feature extraction method was evaluated on isolated speech words which are ext...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012